List of Flash News about Andrej Karpathy
| Time | Details |
|---|---|
|
2025-10-26 16:24 |
PyTorch MPS addcmul_ Silent-Failure Bug on Non-Contiguous Tensors Flags AI Training Risk: What Traders Should Watch
According to @karpathy, a detailed debugging investigation traced a suspicious training loss curve to a PyTorch MPS backend issue where addcmul_ silently fails on non-contiguous output tensors in the Objective-C++ path, pointing to a correctness bug that does not throw errors during training; Source: @karpathy on X https://twitter.com/karpathy/status/1982483540899237981 and the referenced thread by @ElanaPearl https://x.com/ElanaPearl/status/1981389648695025849. For AI workflow reliability, this implies Mac Apple MPS-based training can yield incorrect results without explicit runtime alerts, directly impacting the integrity of model training and evaluation pipelines used by practitioners; Source: @karpathy on X https://twitter.com/karpathy/status/1982483540899237981 and @ElanaPearl on X https://x.com/ElanaPearl/status/1981389648695025849. For traders, treat this as a software reliability risk flag within the AI toolchain and monitor official PyTorch or Apple MPS updates and release notes that reference addcmul_ or non-contiguous tensor handling, as confirmed fixes would reduce operational uncertainty around AI workloads that markets track for sentiment; Source: @karpathy on X https://twitter.com/karpathy/status/1982483540899237981 and @ElanaPearl on X https://x.com/ElanaPearl/status/1981389648695025849. |
|
2025-10-21 15:59 |
Andrej Karpathy Unveils nanochat d32: $800 Synthetic-Data Custom LLM Identity and Script Release, Key Signals for AI Agent Builders
According to @karpathy, nanochat now carries a defined identity and can state its capabilities, including that it is nanochat d32 built by him with a reported $800 cost and weaker non-English proficiency, achieved via synthetic data generation, source: x.com/karpathy/status/1980508380860150038. He released an example script that demonstrates generating diverse synthetic conversations and mixing them into mid-training or SFT, stressing the importance of entropy to avoid repetitive datasets, source: x.com/karpathy/status/1980508380860150038. He adds that base LLMs lack inherent personality or self-knowledge and require explicitly bolted-on traits via curated synthetic data, source: x.com/karpathy/status/1980508380860150038. For traders, the disclosed $800 customization benchmark and open-source workflow provide concrete cost and process reference points for evaluating open-source AI agent development and adoption paths across AI-linked tokens and AI-exposed equities, source: twitter.com/karpathy/status/1980665134415802554. |
|
2025-10-20 22:13 |
Andrej Karpathy: DeepSeek-OCR Signals 4 Reasons Pixels May Beat Text Tokens for LLM Inputs — Efficiency, Shorter Context Windows, Bidirectional Attention, No Tokenizer
According to Andrej Karpathy, the DeepSeek-OCR paper is a strong OCR model and more importantly highlights why pixels might be superior to text tokens as inputs to large language models, emphasizing model efficiency and input fidelity, source: Andrej Karpathy on X, Oct 20, 2025. He states that rendering text to images and feeding pixels can deliver greater information compression, enabling shorter context windows and higher efficiency, source: Andrej Karpathy on X, Oct 20, 2025. He adds that pixel inputs provide a more general information stream that preserves formatting such as bold and color and allows arbitrary images alongside text, source: Andrej Karpathy on X, Oct 20, 2025. He argues that image inputs enable bidirectional attention by default instead of autoregressive attention at the input stage, which he characterizes as more powerful for processing, source: Andrej Karpathy on X, Oct 20, 2025. He advocates removing the tokenizer at input due to the complexity and risks of Unicode and byte encodings, including security or jailbreak issues such as continuation bytes and semantic mismatches for emojis, source: Andrej Karpathy on X, Oct 20, 2025. He frames OCR as one of many vision-to-text tasks and suggests many text-to-text tasks can be reframed as vision-to-text, while the reverse is not generally true, source: Andrej Karpathy on X, Oct 20, 2025. He proposes a practical setup where user messages are images while the assistant response remains text and notes outputting pixels is less obvious, and he mentions an urge to build an image-input-only version of nanochat while referencing the vLLM project, source: Andrej Karpathy on X, Oct 20, 2025. |
|
2025-10-16 00:14 |
Karpathy Unveils $1,000 nanochat d32: 33-Hour Train, CORE 0.31, GSM8K 20% — Watch AI Compute Tokens RNDR, AKT, TAO
According to @karpathy, the depth-32 nanochat d32 trained for about 33 hours at roughly $1,000 and showed consistent metric gains across pretraining, SFT, and RL (Source: Karpathy on X; Karpathy GitHub nanochat discussion). He reports a CORE score of 0.31 versus GPT-2 at about 0.26 and GSM8K improvement from around 8% to about 20%, indicating a notable uplift for a micro model (Source: Karpathy on X; Karpathy GitHub nanochat discussion). He cautions that nanochat costs $100–$1,000 to train and the $100 version is about 1/1000th the size of GPT-3, leading to frequent hallucinations and limited reliability compared to frontier LLMs, so user expectations should remain modest (Source: Karpathy on X). He adds that scripts including run1000 sh are available in the repo, he is temporarily hosting the model for testing, and he plans throughput tuning before possibly scaling to a larger tier (Source: Karpathy on X; Karpathy GitHub repository). For traders, decentralized GPU networks that market AI workload support such as Render (RNDR), Akash (AKT), and Bittensor (TAO) remain key watchlist names as open-source, low-cost training expands developer experimentation (Source: Render Network documentation; Akash Network documentation; Bittensor documentation). |
|
2025-10-13 15:16 |
Andrej Karpathy Releases nanochat: Train a ChatGPT-Style LLM in 4 Hours for about $100 on 8x H100, Setting Clear GPU Cost Benchmarks for Traders
According to @karpathy, nanochat is a minimal from-scratch full-stack pipeline that lets users train and serve a simple ChatGPT-like LLM via a single script on a cloud GPU and converse with it in a web UI in about 4 hours, enabling an end-to-end training and inference workflow. source: @karpathy. He specifies the codebase has about 8,000 lines and includes tokenizer training in Rust, pretraining on FineWeb with CORE evaluation, midtraining on SmolTalk and multiple-choice data with tool use, supervised fine-tuning, optional RL on GSM8K via GRPO, and an inference engine with KV cache, Python tool use, CLI, a ChatGPT-like web UI, plus an auto report card. source: @karpathy. Disclosed cost and timing benchmarks are about $100 for roughly 4 hours on an 8x H100 node and about $1000 for about 41.6 hours, with a 24-hour depth-30 run reaching MMLU in the 40s, ARC-Easy in the 70s, and GSM8K in the 20s. source: @karpathy. From these figures, the implied compute rate is roughly $3.1 per H100-hour (about $100 across 32 H100-hours) and about $3.0 per H100-hour at the longer run (about $1000 across 332.8 H100-hours), providing concrete GPU-hour cost benchmarks for trading models of AI training spend. source: @karpathy. He also notes that around 12 hours surpasses GPT-2 on the CORE metric and that capability improves with more training, positioning nanochat as a transparent strong-baseline stack and the capstone for LLM101n with potential as a research harness. source: @karpathy. For crypto market participants tracking AI infrastructure, these cost-performance disclosures offer reference points to assess demand for centralized cloud and decentralized GPU compute tied to open-source LLM training workflows. source: @karpathy. |
|
2025-10-09 00:10 |
Andrej Karpathy flags RLHF flaw: LLMs fear exceptions and calls for reward redesign in RL training
According to Andrej Karpathy, current reinforcement learning practices make LLMs mortally terrified of exceptions, and he argues exceptions are a normal part of a healthy development process, as stated on Twitter on Oct 9, 2025. Karpathy urged the community to sign his LLM welfare petition to improve rewards in cases of exceptions, as stated on Twitter on Oct 9, 2025. The post includes no references to cryptocurrencies, tokens, or market data, indicating no direct market update from the source, as stated on Twitter on Oct 9, 2025. |
|
2025-10-03 13:37 |
Karpathy: LLM Agent Coding Not Ready for Half of Professional Work Despite ~50% ‘Mostly Agent’ Poll Signal
According to Andrej Karpathy, an X poll he referenced showed roughly half of respondents reporting they mostly use agent‑mode coding, contrary to his expectation of 50 percent tab‑complete, 30 percent manual, 20 percent agent, source: Andrej Karpathy on X, Oct 3, 2025, https://x.com/karpathy/status/1974106507034964111; poll link https://x.com/karpathy/status/1973892769359056997. He states his own workflow is primarily tab completion and he turns it off when not useful, using agents mainly for boilerplate or unfamiliar stacks with substantial review and edits, source: Andrej Karpathy on X, Oct 3, 2025, https://x.com/karpathy/status/1974106507034964111. He warns that when tasks are deep, tangled, or off the data manifold, LLMs produce bloated code with subtle bugs, concluding agent mode is not ready to write about half of professional code, source: Andrej Karpathy on X, Oct 3, 2025, https://x.com/karpathy/status/1974106507034964111. He asked for a serious organization to rerun the poll, underscoring uncertainty around actual adoption rates, source: Andrej Karpathy on X, Oct 3, 2025, https://x.com/karpathy/status/1974106507034964111. There was no mention of cryptocurrencies or blockchain in his comments, source: Andrej Karpathy on X, Oct 3, 2025, https://x.com/karpathy/status/1974106507034964111. |
|
2025-10-01 19:22 |
Andrej Karpathy: Tinker Cuts LLM Post-Training Complexity to Under 10% and Keeps 90% Algorithmic Control for Faster Finetuning
According to @karpathy, Tinker allows researchers and developers to retain roughly 90% of algorithmic creative control over data, loss functions, and training algorithms while offloading infrastructure, forward and backward passes, and distributed training to the framework. Source: @karpathy on X, Oct 1, 2025, https://twitter.com/karpathy/status/1973468610917179630 According to @karpathy, Tinker reduces the typical complexity of LLM post-training to well below 10%, positioning it as a lower-friction alternative to common “upload your data, we’ll train your LLM” services. Source: @karpathy on X, Oct 1, 2025, https://twitter.com/karpathy/status/1973468610917179630 According to @karpathy, this “slice” of the post-training workflow both delegates heavy lifting and preserves majority control of data and algorithmic choices, which he views as a more effective trade-off for practitioners. Source: @karpathy on X, Oct 1, 2025, https://twitter.com/karpathy/status/1973468610917179630 According to @karpathy, finetuning is less about stylistic changes and more about narrowing task scope, where fine-tuned smaller LLMs can outperform and run faster than large models prompted with giant few-shot prompts when ample training examples exist. Source: @karpathy on X, Oct 1, 2025, https://twitter.com/karpathy/status/1973468610917179630 According to @karpathy, production LLM applications are increasingly DAG-based pipelines where some steps remain prompt-driven while many components work better as fine-tuned models, and Tinker makes these finetunes trivial for rapid experimentation. Source: @karpathy on X, Oct 1, 2025, https://twitter.com/karpathy/status/1973468610917179630; supporting reference: Thinky Machines post, https://x.com/thinkymachines/status/1973447428977336578 |
|
2025-09-13 16:08 |
Andrej Karpathy References GSM8K (2021) on X: AI Benchmark Signal and What Crypto Traders Should Watch
According to @karpathy, he resurfaced a paragraph from the 2021 GSM8K paper in a Sep 13, 2025 X post, highlighting ongoing attention to LLM reasoning evaluation (source: Andrej Karpathy, X post on Sep 13, 2025). GSM8K is a grade‑school math word‑problem benchmark designed to assess multi‑step reasoning in language models, making it a primary metric for tracking verified reasoning improvements (source: Cobbe et al., GSM8K paper, 2021). Because the post does not announce a new model, dataset, or benchmark score, there is no immediate, verifiable trading catalyst for AI‑linked crypto assets at this time (source: Andrej Karpathy, X post on Sep 13, 2025). Traders should wait for measurable GSM8K score gains or product release notes before positioning, as GSM8K is specifically used to quantify reasoning progress (source: Cobbe et al., GSM8K paper, 2021). |
|
2025-09-09 15:36 |
Apple Event 2025 Livestream at 10am: Key Time Cue for AAPL Traders Watching New iPhones
According to @karpathy, Apple’s iPhone event livestream is scheduled today at 10am, roughly 1.5 hours after his post time, giving AAPL traders a precise headline window to plan event-driven setups (source: @karpathy on X, Sep 9, 2025). He also notes he has watched every annual iPhone reveal since 2007 and hopes for an iPhone mini, though he does not expect it to appear (source: @karpathy on X, Sep 9, 2025). No cryptocurrencies are mentioned in the post, so there are no direct crypto-market cues from this source ahead of the stream (source: @karpathy on X, Sep 9, 2025). |
|
2025-09-05 17:38 |
Andrej Karpathy Praises OpenAI GPT-5 Pro Code Generation: Key Trading Signals for AI and Crypto Markets
According to @karpathy, OpenAI’s GPT-5 Pro solved a complex coding task by returning working code after about 10 minutes, following roughly an hour of intermittent attempts with “CC” that did not succeed, indicating a strong qualitative performance on difficult problems. Source: @karpathy (X, Sep 5, 2025). He adds that he had “CC” read the GPT-5 Pro output and it produced two paragraphs admiring the solution, reinforcing his positive assessment of GPT-5 Pro’s code-generation quality. Source: @karpathy (X, Sep 5, 2025). The post offers developer-level endorsement of GPT-5 Pro’s coding capability but provides no market reaction, price action, or product release details, so traders should treat it as a sentiment data point rather than a quantitative catalyst. Source: @karpathy (X, Sep 5, 2025). |
|
2025-08-28 18:07 |
Karpathy Flags LLM-First Data Interfaces: 5 Crypto Infrastructure Plays to Watch (RNDR, FIL, AR, GRT, FET)
According to @karpathy, transforming human knowledge, sensors, and actuators from human-first to LLM-first and LLM-legible interfaces is a high-potential area, with the example that every textbook PDF/EPUB could map to a perfect machine-legible representation for AI agents. Source: x.com/karpathy/status/1961128638725923119 For traders, this theme implies increased need for decentralized, scalable storage of machine-readable corpora, aligning with Filecoin’s content-addressed storage and retrieval model and Arweave’s permanent data storage guarantees. Sources: x.com/karpathy/status/1961128638725923119; docs.filecoin.io; docs.arweave.org LLM-first pipelines also require indexing and semantic querying layers, mirroring The Graph’s subgraph architecture that makes structured data queryable for applications. Sources: x.com/karpathy/status/1961128638725923119; thegraph.com/docs Serving and training LLMs and agentic workloads depend on distributed GPU compute, directly mapped to Render Network’s decentralized GPU marketplace. Sources: x.com/karpathy/status/1961128638725923119; docs.rendernetwork.com Agentic interaction with sensors/actuators points to on-chain agent frameworks and microtransaction rails, a design space covered by Fetch.ai’s autonomous agent tooling. Sources: x.com/karpathy/status/1961128638725923119; docs.fetch.ai |
|
2025-08-24 19:46 |
Andrej Karpathy Reveals 75% Bread-and-Butter LLM Coding Flow and Diversified Workflows — Signal for AI Traders in 2025
According to @karpathy, his LLM-assisted coding usage is diversifying across multiple workflows that he stitches together rather than relying on a single perfect setup, source: @karpathy on X, Aug 24, 2025. He notes a primary bread-and-butter flow accounts for roughly 75 percent of his usage, indicating a dominant main pipeline supplemented by secondary workflows, source: @karpathy on X, Aug 24, 2025. The post frames this as part of his ongoing pursuit of an optimal LLM-assisted coding experience, source: @karpathy on X, Aug 24, 2025. The post does not name any tools, products, benchmarks, tickers, or cryptocurrencies and provides no quantitative performance data or market impact, source: @karpathy on X, Aug 24, 2025. |
|
2025-06-20 21:18 |
Highest Grade LLM Pretraining Data: Andrej Karpathy Analyzes Textbook-Like Content and AI Model Samples for Optimal Quality
According to Andrej Karpathy on Twitter, the ideal pretraining data stream for large language model (LLM) training, when focusing solely on quality, could resemble highly curated textbook-like content in markdown or even samples generated from advanced AI models. This insight is highly relevant for traders as the evolution of AI training methods can lead to substantial improvements in AI-driven crypto trading algorithms, potentially impacting the volatility and efficiency of cryptocurrency markets (source: @karpathy, Twitter, June 20, 2025). |
|
2025-06-19 19:19 |
GUI for LLMs Demo by Andrej Karpathy Highlights Ephemeral UI Generation and Its Impact on Crypto and AI Markets
According to Andrej Karpathy, a new demo showcases a GUI for large language models (LLMs) that dynamically generates ephemeral user interfaces tailored to specific tasks, as reported via Twitter on June 19, 2025. This innovation signals a shift in AI application design, potentially accelerating adoption in decentralized app (dApp) interfaces and blockchain-based platforms. For traders, this could impact demand for AI-integrated crypto tokens and projects leveraging LLMs, especially those focused on user experience and automation in the DeFi sector (source: @karpathy). |
|
2025-06-19 02:05 |
How Andrej Karpathy’s LLM Research and Software 2.0 Vision Impact Crypto Trading and Blockchain Innovation
According to Andrej Karpathy (@karpathy), recent advancements in large language models (LLMs) and the Software 2.0 paradigm are fundamentally accelerating technology diffusion and automation in software development (source: Karpathy, Twitter, June 19, 2025; slides, blog post). For crypto traders, this rapid evolution signals increased adoption of AI-driven protocols, enhanced smart contract automation, and new DeFi trading strategies powered by generative AI. The referenced materials provide actionable insights for traders seeking to leverage AI advancements for automated trading, improved risk management, and the identification of innovative blockchain projects integrating LLM-driven solutions. |
|
2025-06-19 02:01 |
Andrej Karpathy Highlights AI Startup School Impact: LLMs Revolutionizing Software in 2025
According to Andrej Karpathy, LLMs are fundamentally transforming the software landscape by enabling programming in natural English, representing a major version upgrade for computer technology (source: Twitter @karpathy, June 19, 2025). This paradigm shift in AI development is poised to drive innovation across crypto and blockchain sectors, as more projects leverage LLMs to enhance smart contract automation and DeFi protocols. Traders should closely monitor cryptocurrencies and tokens related to AI infrastructure, as advancements in large language models are likely to accelerate adoption and value creation within the crypto market. |
|
2025-06-17 20:38 |
YC AI Startup School 2025 Recordings to Offer Key Insights for Crypto Traders and Builders
According to Andrej Karpathy, the YC AI Startup School 2025 event recordings will be released in the coming weeks, providing valuable insights for crypto traders and AI-focused blockchain projects. The event, organized by Y Combinator, brought together top AI builders and innovators, potentially influencing trends in AI-driven crypto trading strategies and blockchain technology adoption (Source: @karpathy on Twitter, June 17, 2025). Traders should watch for the release as it may offer actionable information on integrating AI with crypto trading and project development. |
|
2025-06-16 17:02 |
LLM Agent Security Risks: Trading Implications for Crypto Investors – Insights from Andrej Karpathy
According to Andrej Karpathy on Twitter, the security risk is highest when running local LLM agents such as Cursor and Claude Code, while interacting with LLMs on web platforms like ChatGPT presents a much lower risk unless advanced features like Connectors are enabled. For crypto traders, this distinction is critical as compromised local agents could expose sensitive trading data or private keys, increasing the risk of wallet breaches or unauthorized transactions (source: @karpathy, June 16, 2025). As AI tools become more integrated into crypto trading workflows, users should carefully manage permissions and avoid enabling Connectors unless absolutely necessary to mitigate cybersecurity threats. |
|
2025-06-16 16:37 |
Prompt Injection Attacks in LLMs: Growing Threats and Crypto Market Security Risks in 2025
According to Andrej Karpathy on Twitter, prompt injection attacks targeting large language models (LLMs) are emerging as a major cybersecurity concern in 2025, reminiscent of the early days of computer viruses. Karpathy highlights that malicious prompts hidden in web data and tools lack robust defenses, increasing vulnerability for AI-integrated platforms. For crypto traders, this raises urgent concerns about the security of AI-driven trading bots and DeFi platforms, as prompt injection could lead to unauthorized transactions or data breaches. Traders should closely monitor their AI-powered tools and ensure rigorous security protocols are in place, as the lack of mature 'antivirus' solutions for LLMs could impact the integrity of crypto operations. (Source: Andrej Karpathy, Twitter, June 16, 2025) |